feat(backup): incremental backup system for ~/.obk data #114
Merged
priyanshujain merged 49 commits into master on Mar 22, 2026
…ckup paths Adds config types for the backup system: destination (r2/gdrive), schedule, and credential refs. Also adds BackupDir, BackupStagingDir, and BackupLastManifestPath helper functions.
Defines the storage backend interface (Put/Get/Head/List/Delete) and a LocalBackend for testing against the local filesystem.
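To make the five operations concrete, here is a minimal sketch of what such an interface could look like. The method signatures, the `MemBackend` type, and the `roundTrip` helper are assumptions made for illustration; the PR names the operations but does not show their signatures, and the in-memory map only stands in for the filesystem-based LocalBackend.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
	"sort"
	"strings"
)

// Backend is a hypothetical shape for the storage interface. The PR names the
// five operations but not their signatures.
type Backend interface {
	Put(ctx context.Context, key string, r io.Reader) error
	Get(ctx context.Context, key string) (io.ReadCloser, error)
	Head(ctx context.Context, key string) (int64, error) // object size in bytes
	List(ctx context.Context, prefix string) ([]string, error)
	Delete(ctx context.Context, key string) error
}

// MemBackend keeps objects in a map. It is not safe for concurrent use and is
// meant only for tests.
type MemBackend struct {
	objects map[string][]byte
}

func NewMemBackend() *MemBackend {
	return &MemBackend{objects: make(map[string][]byte)}
}

func (m *MemBackend) Put(_ context.Context, key string, r io.Reader) error {
	data, err := io.ReadAll(r)
	if err != nil {
		return err
	}
	m.objects[key] = data
	return nil
}

func (m *MemBackend) Get(_ context.Context, key string) (io.ReadCloser, error) {
	data, ok := m.objects[key]
	if !ok {
		return nil, fmt.Errorf("get %q: no such object", key)
	}
	return io.NopCloser(bytes.NewReader(data)), nil
}

func (m *MemBackend) Head(_ context.Context, key string) (int64, error) {
	data, ok := m.objects[key]
	if !ok {
		return 0, fmt.Errorf("head %q: no such object", key)
	}
	return int64(len(data)), nil
}

func (m *MemBackend) List(_ context.Context, prefix string) ([]string, error) {
	var keys []string
	for k := range m.objects {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys) // deterministic order for callers and tests
	return keys, nil
}

func (m *MemBackend) Delete(_ context.Context, key string) error {
	delete(m.objects, key)
	return nil
}

// Compile-time check that MemBackend satisfies Backend.
var _ Backend = (*MemBackend)(nil)

// roundTrip stores one object and reads it back through the interface.
func roundTrip(b Backend, key, body string) (string, error) {
	ctx := context.Background()
	if err := b.Put(ctx, key, strings.NewReader(body)); err != nil {
		return "", err
	}
	rc, err := b.Get(ctx, key)
	if err != nil {
		return "", err
	}
	defer rc.Close()
	data, err := io.ReadAll(rc)
	return string(data), err
}
```

Keeping the interface this small is what would let R2, Google Drive, and a local test double be swapped without touching the backup pipeline.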
Manifest tracks file hashes, sizes, and compressed sizes per snapshot. DiffManifest compares against previous to find changed/removed files.
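The diff step described above could be sketched as follows. The `FileEntry` and `Manifest` field names and the `DiffManifest` signature are assumptions; the PR only states what is tracked (hash, size, compressed size) and that the diff yields changed and removed files.

```go
package main

import "sort"

// FileEntry is a hypothetical per-file record in a snapshot manifest.
type FileEntry struct {
	SHA256         string
	Size           int64
	CompressedSize int64
}

// Manifest is a hypothetical snapshot manifest keyed by path relative to ~/.obk.
type Manifest struct {
	ID    string
	Files map[string]FileEntry
}

// DiffManifest returns the paths that are new or whose hash changed since
// prev, and the paths that existed in prev but are gone in cur. A nil prev
// means "first backup", so every file counts as changed.
func DiffManifest(prev, cur *Manifest) (changed, removed []string) {
	for path, entry := range cur.Files {
		if prev == nil {
			changed = append(changed, path)
			continue
		}
		old, ok := prev.Files[path]
		if !ok || old.SHA256 != entry.SHA256 {
			changed = append(changed, path)
		}
	}
	if prev != nil {
		for path := range prev.Files {
			if _, ok := cur.Files[path]; !ok {
				removed = append(removed, path)
			}
		}
	}
	// Map iteration order is random; sort for stable output.
	sort.Strings(changed)
	sort.Strings(removed)
	return changed, removed
}
```

Comparing by content hash rather than modification time means a file that was rewritten with identical bytes is not uploaded again.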
Walks ~/.obk and matches files against configurable include patterns (data.db, config files, learnings, skills) while excluding WAL files, logs, scratch dirs, and the backup dir itself.
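As an illustration of the include/exclude rules, here is a small sketch of a matcher over slash-separated relative paths. The concrete patterns, suffixes, and directory names are assumptions chosen to mirror the description; the actual lists in the PR are configurable and are not shown.

```go
package main

import (
	"path"
	"strings"
)

// Hypothetical rule sets mirroring the description above.
var (
	includePatterns = []string{"data.db", "*.json", "*.yaml", "*.md"}
	excludeSuffixes = []string{"-wal", "-shm", ".log"}
	excludeDirs     = map[string]bool{"backup": true, "scratch": true}
)

// shouldBackup reports whether rel (a slash-separated path relative to the
// data directory) passes the exclude rules and matches an include pattern.
func shouldBackup(rel string) bool {
	segs := strings.Split(path.Clean(rel), "/")
	base := segs[len(segs)-1]

	// Reject anything that lives under an excluded directory.
	for _, dir := range segs[:len(segs)-1] {
		if excludeDirs[dir] {
			return false
		}
	}
	// Reject WAL/SHM sidecar files and logs by suffix.
	for _, suf := range excludeSuffixes {
		if strings.HasSuffix(base, suf) {
			return false
		}
	}
	// Accept only files whose base name matches an include pattern.
	for _, pat := range includePatterns {
		if ok, err := path.Match(pat, base); err == nil && ok {
			return true
		}
	}
	return false
}
```

Excludes are checked before includes so that a broad include pattern can never pull in the backup directory itself.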
Uses VACUUM INTO to create a consistent point-in-time copy of each .db file into a staging directory, avoiding WAL corruption issues.
Implements the main backup pipeline: scan files, VACUUM INTO for .db files, SHA-256 hash, diff against last manifest, zstd compress and upload changed files, then save new manifest.
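The hashing step of this pipeline can be sketched with the standard library alone. The function names below are hypothetical; the sketch streams the input through SHA-256 so the file never has to fit in memory, and returns the byte count that a manifest entry would record as the size.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"io"
	"os"
)

// hashReader streams r through SHA-256 and returns the hex digest and the
// number of bytes read.
func hashReader(r io.Reader) (sum string, size int64, err error) {
	h := sha256.New()
	size, err = io.Copy(h, r)
	if err != nil {
		return "", size, err
	}
	return hex.EncodeToString(h.Sum(nil)), size, nil
}

// hashFile hashes the file at path without loading it fully into memory.
func hashFile(path string) (string, int64, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", 0, err
	}
	defer f.Close()
	return hashReader(f)
}

// hashBytes is a convenience wrapper for in-memory data.
func hashBytes(data []byte) (string, int64, error) {
	return hashReader(bytes.NewReader(data))
}
```

For .db files the hash would presumably be taken over the staged VACUUM INTO copy rather than the live database, since the live file can change while it is being read.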
Implements the Backend interface for Cloudflare R2 (S3-compatible). Includes ValidateR2() for connection testing during setup.
Implements Backend interface for Google Drive using the drive.file scope. Includes FindOrCreateDriveFolder() for setup.
Restore downloads objects from a snapshot, decompresses them, and writes to ~/.obk. ListSnapshots and GetManifest support the CLI.
Adds "Backup" to the source multi-select. The setup flow prompts for R2 or Google Drive destination, validates connection, stores credentials in keychain, and lets user pick a backup schedule.
Adds Backup section with enabled, destination, schedule fields plus R2 (bucket, endpoint, access key, secret key) and Google Drive (folder ID) sub-categories. Credentials use keyring via TypePassword.
Adds `obk backup now|list|status|restore` subcommands. The resolve helper creates the appropriate backend from config + keyring.
Registers BackupWorker with River and adds a periodic job based on the configured backup schedule (6h/12h/24h). Skips if backup is not enabled or not linked.
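A sketch of how the schedule string might be turned into an interval, leaving out the River wiring. `parseSchedule`, `shouldSchedule`, and the fallback to 6h for unrecognized values are assumptions for illustration; the PR only states the three allowed values and the skip conditions.

```go
package main

import "time"

// defaultInterval is used when the configured schedule is empty or unknown.
const defaultInterval = 6 * time.Hour

// parseSchedule maps the allowed schedule strings to a duration.
func parseSchedule(s string) time.Duration {
	switch s {
	case "6h", "12h", "24h":
		if d, err := time.ParseDuration(s); err == nil {
			return d
		}
	}
	return defaultInterval
}

// shouldSchedule mirrors the skip rule: no periodic job unless backup is
// both enabled and linked to a destination.
func shouldSchedule(enabled, linked bool) bool {
	return enabled && linked
}
```

Restricting the accepted values with a switch, instead of passing any string to `time.ParseDuration`, keeps a typo such as "6m" from scheduling a backup every six minutes.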
Tests manifest diff, load/save, file scanner include/exclude rules, local backend CRUD operations, full backup flow with hashing and compression, object key generation, and compress/decompress round-trip.
Updates expected node count from 6 to 7 and adds "Backup" to the expected labels list.
VacuumInto now takes relPath to preserve directory structure in staging (gmail/data.db and whatsapp/data.db no longer collide). Service now stores manifestPath and stagingDir as fields, with NewWithPaths() for tests to inject temp dirs.
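The collision fix comes down to joining the staging directory with the file's relative path instead of its base name. The helper below is a hypothetical sketch of that path computation, with an added guard (an assumption, not something the PR mentions) that rejects paths escaping the staging directory.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// stagingPath maps a path relative to the data directory onto the staging
// directory, preserving sub-directories so same-named files do not collide.
func stagingPath(stagingDir, relPath string) (string, error) {
	clean := filepath.Clean(relPath)
	escapes := clean == ".." || strings.HasPrefix(clean, ".."+string(filepath.Separator))
	if filepath.IsAbs(clean) || escapes {
		return "", fmt.Errorf("staging path %q escapes the staging directory", relPath)
	}
	return filepath.Join(stagingDir, clean), nil
}
```

With this mapping, gmail/data.db and whatsapp/data.db land in separate sub-directories of the staging area. The caller would still need to create the parent directory before running VACUUM INTO.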
Adds 17 tests covering: Service.Run() full flow, incremental backup, Restore with file verification, ListSnapshots empty/populated, GetManifest, VacuumInto with real SQLite, VacuumInto no-collision for same-named DBs in different dirs, Run with SQLite VACUUM INTO + restore round-trip, and expanded scanner/backend edge cases. Coverage: 26.8% → 51.7% (remaining 0% is R2/GDrive requiring infra).
backup.enabled Set() now refuses to enable unless the active destination has all required fields (R2: bucket, endpoint, both keys; GDrive: folder ID). R2/GDrive fields are ReadOnly when not the active destination. Schedule is ReadOnly until a destination is configured. Adds 12 tests covering all validation paths.
…ination authenticated
- backup.enabled is ReadOnly (greyed out) until the destination is fully configured with credentials; it is always editable when already enabled so the user can disable it
- Changing backup.destination resets enabled to false (forces re-validation)
- backup.schedule is ReadOnly until backup is enabled (not just configured)
- Added tests: ReadOnly guards, destination change resets enabled, same-value destination keeps enabled
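The enable and destination rules from the last two commits can be sketched as plain methods on a config struct. All type, field, and method names here are hypothetical, and credentials are shown as plain strings for brevity even though the PR keeps them in the keyring.

```go
package main

import "errors"

// Hypothetical config shapes; the PR stores secrets in the keyring instead.
type R2Config struct{ Bucket, Endpoint, AccessKey, SecretKey string }
type GDriveConfig struct{ FolderID string }

type BackupConfig struct {
	Enabled     bool
	Destination string // "r2" or "gdrive"
	R2          R2Config
	GDrive      GDriveConfig
}

var errNotConfigured = errors.New("backup destination is not fully configured")

// DestConfigured reports whether the active destination has every required field.
func (c *BackupConfig) DestConfigured() bool {
	switch c.Destination {
	case "r2":
		r := c.R2
		return r.Bucket != "" && r.Endpoint != "" && r.AccessKey != "" && r.SecretKey != ""
	case "gdrive":
		return c.GDrive.FolderID != ""
	}
	return false
}

// SetEnabled refuses to enable an unconfigured destination. Disabling is
// always allowed.
func (c *BackupConfig) SetEnabled(on bool) error {
	if on && !c.DestConfigured() {
		return errNotConfigured
	}
	c.Enabled = on
	return nil
}

// SetDestination switches destination and forces re-validation by clearing
// Enabled. Setting the same value again is a no-op.
func (c *BackupConfig) SetDestination(dest string) {
	if dest == c.Destination {
		return
	}
	c.Destination = dest
	c.Enabled = false
}
```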
…verify → schedule
When the user clicks "Enabled" or "Destination" in backup settings, a wizard starts automatically (like the LLM profile wizard):
1. Select destination (R2 or Google Drive)
2. Enter credentials (R2: bucket/endpoint/keys, GDrive: folder ID)
3. Verify connection (R2 tested via S3 API)
4. Select schedule
5. Auto-enable and save
No more individual field editing; the wizard chains all steps.
…atically
Instead of asking for a raw folder ID, the GDrive backup wizard now:
1. Asks for a folder name (default "obk-backup")
2. Runs Google OAuth flow (opens browser for authentication)
3. Creates or finds the folder in Google Drive
4. Sets folder ID automatically
Added SetupGDrive callback to settings.Service, wired up in settings_cmd.go with the full OAuth + FindOrCreateDriveFolder flow.
…play The folder ID is an internal value that means nothing to users. It's set automatically by the wizard/setup flow. No reason to show or edit it.
R2 credential fields now explain where to find each value on the Cloudflare Dashboard. GDrive setup goes straight to OAuth with default "obk-backup" folder name (no unnecessary prompt).
R2 sub-category only appears in the settings tree when destination is set to "r2". Tree rebuilds after any backup field edit so the category appears/disappears dynamically.
R2 fields are now hidden (not just read-only) when destination != r2. Tests verify fields are absent when gdrive, present when r2.
Settings service now supports a triggerBackup callback for running a backup after config changes, and exposes IsBackupDestConfigured to check if a destination has valid credentials.
- Remove schedule step from wizard (default to 6h)
- All config changes are transactional: revert on Esc or failure
- Destination change: if already authenticated, swap immediately; otherwise run auth flow, revert if incomplete
- First-time wizard skips destination picker if dest already set
- Trigger backup when stale after: wizard complete, dest swap, or re-enable toggle
Resolves R2 or GDrive backend and runs backup synchronously when the TUI determines a backup is stale after config changes.
Default to 6h schedule. Users can change it later in obk settings.
Verify R2 and GDrive destination configuration checks for partial and complete config states.
Tests cover: deep clone, schedule parsing, rollback, wizard state transitions, destination swap config, save defaults.
VACUUM INTO doesn't support parameterized queries, so a filename containing a single quote would break or inject into the SQL statement.
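A minimal sketch of the quoting fix, assuming the statement is built as a string as the commit describes. In SQLite string literals a single quote is escaped by doubling it; the function name is hypothetical.

```go
package main

import "strings"

// vacuumIntoStmt builds a VACUUM INTO statement with the destination path as
// a SQL string literal. Single quotes are doubled so a path such as
// /tmp/it's.db cannot terminate the literal early.
func vacuumIntoStmt(destPath string) string {
	escaped := strings.ReplaceAll(destPath, "'", "''")
	return "VACUUM INTO '" + escaped + "'"
}
```

For example, `vacuumIntoStmt("/tmp/it's.db")` yields `VACUUM INTO '/tmp/it''s.db'`.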
Shows file count and target directory, asks for y/N confirmation. Use --force to skip for scripting.
A folder or file name containing a single quote (e.g. "it's-a-backup") would break the Drive API query. Escape with backslash per Drive API spec.
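The escaping could look roughly like this. The function names and the exact query built by `folderQuery` are assumptions; escaping the backslash itself before the quote is an extra precaution added here, not something the commit message states.

```go
package main

import "strings"

// escapeDriveQuery escapes a value for use inside a single-quoted string in
// a Drive API search query. Backslashes are escaped first so the backslash
// added for a quote is not escaped a second time.
func escapeDriveQuery(s string) string {
	s = strings.ReplaceAll(s, `\`, `\\`)
	return strings.ReplaceAll(s, `'`, `\'`)
}

// folderQuery builds a hypothetical lookup query for a folder by name.
func folderQuery(name string) string {
	return "name = '" + escapeDriveQuery(name) +
		"' and mimeType = 'application/vnd.google-apps.folder' and trashed = false"
}
```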
…tion Backend resolution logic was duplicated in daemon/jobs, cli/backup, and cli/settings_cmd. Now all three use backupsvc.ResolveBackend with injected credential resolver and Google client factory.
Two backups within the same second (e.g. daemon + manual) would produce the same ID and overwrite each other. Appending 4 random bytes ensures uniqueness.
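One way to build such an ID is sketched below. The timestamp layout and the `-` separator are assumptions, since the PR does not show the ID format; only the 4 random bytes come from the commit message.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"time"
)

// newSnapshotID returns a second-resolution UTC timestamp followed by
// 4 random bytes in hex, so two backups started in the same second still
// get distinct IDs.
func newSnapshotID() (string, error) {
	var suffix [4]byte
	if _, err := rand.Read(suffix[:]); err != nil {
		return "", err
	}
	ts := time.Now().UTC().Format("20060102T150405Z")
	return ts + "-" + hex.EncodeToString(suffix[:]), nil
}
```

Keeping the timestamp as the prefix means IDs still sort chronologically, which helps when listing snapshots.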
…o memory compressFile now uses io.Copy from the file to the zstd writer, avoiding loading the entire file contents into memory before compression.
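The streaming pattern can be shown with the standard library. Note the substitution: this sketch uses `compress/gzip` as a stand-in because zstd is not in the Go standard library, and the helper names and signatures are hypothetical. The point is the `io.Copy` from the source into the compressing writer.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"io"
	"os"
)

// compressStream copies src into a compressing writer on top of dst. Data
// flows through in chunks, so memory use does not grow with the input size.
func compressStream(dst io.Writer, src io.Reader) (int64, error) {
	zw := gzip.NewWriter(dst) // stand-in for a zstd encoder
	n, err := io.Copy(zw, src)
	if err != nil {
		zw.Close()
		return n, err
	}
	// Close flushes buffered data and writes the trailer.
	return n, zw.Close()
}

// compressFile streams srcPath into a compressed file at dstPath.
func compressFile(srcPath, dstPath string) (err error) {
	in, err := os.Open(srcPath)
	if err != nil {
		return err
	}
	defer in.Close()
	out, err := os.Create(dstPath)
	if err != nil {
		return err
	}
	defer func() {
		// Surface a failed close, which can mean lost writes.
		if cerr := out.Close(); err == nil {
			err = cerr
		}
	}()
	_, err = compressStream(out, in)
	return err
}

// roundTrip compresses and decompresses data in memory as a sanity check.
func roundTrip(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	if _, err := compressStream(&buf, bytes.NewReader(data)); err != nil {
		return nil, err
	}
	zr, err := gzip.NewReader(&buf)
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}
```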
…kage backupDest was defined identically in both settings/registry.go and internal/settings/tui/backup_wizard.go. Export it from settings and have tui call settings.BackupDest.
filepath.Walk does not descend into symlinked directories, but it still reports symlink entries, and following one that points to a parent directory would cause infinite recursion. Switch to filepath.WalkDir and skip symlink entries explicitly.
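A sketch of a walker that skips symlink entries with `filepath.WalkDir`. The function names are hypothetical, and `demoSymlinkLoop` exists only to exercise the walker against a directory containing a symlink to its own root (it assumes a platform where `os.Symlink` is permitted).

```go
package main

import (
	"io/fs"
	"os"
	"path/filepath"
)

// listRegularFiles returns the regular files under root as relative paths,
// ignoring symlinks entirely.
func listRegularFiles(root string) ([]string, error) {
	var files []string
	err := filepath.WalkDir(root, func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		// Skip symlink entries; they are neither followed nor recorded.
		if d.Type()&fs.ModeSymlink != 0 {
			return nil
		}
		if d.Type().IsRegular() {
			rel, err := filepath.Rel(root, p)
			if err != nil {
				return err
			}
			files = append(files, rel)
		}
		return nil
	})
	return files, err
}

// demoSymlinkLoop builds a temp dir holding one file and a symlink back to
// the dir itself, then walks it.
func demoSymlinkLoop() ([]string, error) {
	root, err := os.MkdirTemp("", "obk-walk")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	if err := os.WriteFile(filepath.Join(root, "data.db"), []byte("x"), 0o600); err != nil {
		return nil, err
	}
	if err := os.Symlink(root, filepath.Join(root, "loop")); err != nil {
		return nil, err
	}
	return listRegularFiles(root)
}
```

`WalkDir` also avoids calling `os.Lstat` on every entry, so the type check on `fs.DirEntry` comes at no extra cost.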
…backend Head and Delete used strings.Contains(err, "not found") which is fragile. Define errNotFound sentinel and use errors.Is for matching.
Verifies that Restore returns an error when the backend objects referenced in the snapshot manifest have been deleted.
Shows human-readable local time alongside the snapshot ID instead of just dumping the raw ID.
Hostname isn't useful in a personal tool. Replace with time-since-last backup and show timestamp in local timezone instead of UTC.
Adds modes, new LLM providers, services, daemon, and storage layers.
Summary
Implements an incremental backup system (Issue #78) that uploads changed files from `~/.obk` to Cloudflare R2 or Google Drive.
- `VACUUM INTO` to create consistent point-in-time copies without WAL corruption
- `obk setup` flow for configuring R2 or Google Drive with credential validation
- `obk backup now|list|status|restore <snapshot-id>`
Remote storage layout
Files added/modified
- `service/backup/` - Core service: backend interface, manifest, scanner, vacuum, R2, GDrive, restore
- `config/config.go` - BackupConfig, R2Config, GDriveConfig structs
- `config/paths.go` - BackupDir(), BackupStagingDir(), BackupLastManifestPath()
- `settings/registry.go` - Backup category with field-level validation and ReadOnly guards
- `internal/cli/setup.go` - Setup wizard for R2 and Google Drive
- `internal/cli/backup/` - CLI subcommands
- `daemon/jobs/backup.go` - River periodic worker
- `daemon/river.go` - Worker registration + periodic job scheduling
Test plan
- `go test ./...`
- `obk setup` → configure R2 → `obk backup now` → `obk backup list` → `obk backup restore`
- `obk setup` → configure Google Drive → same flow